IAES International Journal of Artificial Intelligence (IJ-AI)
Vol. 10, No. 4, December 2021, pp. 879~888
ISSN: 2252-8938, DOI: 10.11591/ijai.v10.i4.pp879-888


Android based application for visually impaired using deep learning approach


Haslinah Mohd Nasir1, Noor Mohd Ariff Brahin2, Mai Mariam Mohamed Aminuddin3,
Mohd Syafiq Mispan4, Mohd Faizal Zulkifli5

1,2,4,5Fakulti Teknologi Kejuruteraan Elektrik dan Elektronik, Universiti Teknikal Malaysia Melaka, Malaysia
3Fakulti Kejuruteraan Elektronik dan Kejuruteraan Komputer, Universiti Teknikal Malaysia Melaka, Malaysia


Article Info

Article history:

Received Dec 10, 2020
Revised Jul 8, 2021
Accepted Aug 29, 2021

Keywords:

Aided engineering
Android application
Convolution neural network
Deep learning
Visually impaired

ABSTRACT

People with visual impairment face difficulties in activities related to their environment, social life,
and technology. Furthermore, they have problems being independent and safe in their daily routine. This
research proposes a deep learning based visual object recognition model to help visually impaired people
in their daily lives using the Android application platform. The research focuses mainly on the
recognition of money, clothes, and other basic items to make their lives easier. The convolution neural
network (CNN) based visual recognition model is built with the TensorFlow object detection application
programming interface (API), which uses a single shot detector (SSD) with a pre-trained MobileNet V2
model developed on a Google dataset. The visually impaired person captures an image, which is compared
with the preloaded image dataset for recognition. A verbal message with the name of the image lets the
blind user know what was captured. The object recognition achieved high accuracy and can be used without
an internet connection. Visually impaired people in particular benefit greatly from this research.

1. INTRODUCTION

According to 2018 statistics from the World Health Organization (WHO), at least 2.2 billion people
worldwide have a vision impairment [1]. Visually impaired people face many problems in their daily lives.
They have difficulty recognizing and differentiating the objects around them, so they rely on guidance
from others, especially for daily tasks. The most challenging tasks for the visually impaired are
recognizing colours and shapes and differentiating currency notes. Nowadays, many assistive technologies
can act as a sighted guide and improve the quality of life of the visually impaired [2], [3].
Computer-based assistive technology is expected to help the visually impaired with their daily tasks. It
includes screen reading software, magnification software, dictation software, refreshable Braille
displays, optical character recognition (OCR) systems, and many more [4]. Assistive technology has grown
from simple devices to sophisticated high-technology solutions [5]-[7].

Most visually impaired people now own a smartphone, as it has become a basic necessity for every
individual. It therefore provides a great platform for applications designed specifically to assist the
visually impaired. A survey by Nora and colleagues found that people with visual impairments frequently
use mobile applications for their daily activities [8]. Furthermore, they are looking for improvements
and new applications that can help them become less dependent on others. Manduchi carried out an
experimental study on specific mobile tasks, such as landmark detection using a mobile phone. His
findings became a basis for designing technology that assists the visually impaired [9].

Many studies have developed assistive technology as Android applications, which people with visual
impairment can use to help them accomplish their daily routines. Tharkude et al. [10] and Parkhi et al.
[11] proposed smart Android applications for blind people based on object detection. The application in
[10] uses the mobile video camera to determine the direction of an object, gives voice instructions for
the current location and directions, and warns the user of obstacles ahead. In [11], the authors designed
object detection from images captured by the smartphone's camera. The performance of the application is
quite good; however, it depends on the quality of the smartphone's built-in camera. Kadam et al. [12]
designed an application that provides speech output for objects detected using an artificial neural
network (ANN) classification approach. However, its accuracy is variable and it still needs improvement
to be more efficient. In addition, a mobile application called Intelligent Eye, with light detection,
colour detection, object recognition, and banknote recognition features, was developed using a deep
learning CNN architecture [13]. A user acceptability survey showed that the application is good in
general and well accepted.

Deep learning has outstanding performance and provides high-quality intelligent services in mobile
device applications. It is mainly applied to image and voice processing and can be further empowered to
make people's daily lives more convenient [14]. A deep learning approach such as the CNN model is known
to provide high accuracy in image classification [15]. It produces numerical results between 0 and 1,
which can be obtained quickly and give high accuracy for classification purposes.
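
To illustrate how classifier outputs can fall between 0 and 1, the minimal Python sketch below applies a softmax to raw class scores. The use of softmax is an assumption (it is a common final layer for image classifiers, but the paper does not name it), and the class names and score values are hypothetical.

```python
import math

def softmax(scores):
    """Convert raw class scores into values in [0, 1] that sum to 1."""
    # Subtract the maximum score for numerical stability.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical class names and raw scores, for illustration only.
labels = ["RM1", "RM10", "RM20"]
raw_scores = [1.2, 4.5, 0.3]

probabilities = softmax(raw_scores)
# The predicted class is the one with the largest probability.
best_index = max(range(len(labels)), key=lambda i: probabilities[i])
print(labels[best_index], round(probabilities[best_index], 4))
```

The largest value can then be reported as the confidence of the predicted class.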

This paper proposes a mobile application based on a deep learning approach, specifically a convolution
neural network (CNN), that may help the visually impaired in their daily lives. The training set for the
CNN is developed on a Google dataset, which Google developed for big data analytics. A CNN needs cloud
storage capable of big data analysis for decision making, classification, and prediction with high
accuracy [16]. The proposed application does not require an internet connection to operate, and it
provides the three types of detection described in section 2. The application allows visually impaired
users to capture things around them with their own smartphone and helps them recognize the captured
object. This makes their lives easier and reduces their dependence on people around them, who may
sometimes be insincere and take advantage of them.

This paper is organized as follows: section 2 describes the method used to develop the application. The
results and discussion are covered in section 3, and section 4 concludes the paper and gives
recommendations for future work.


2. RESEARCH METHOD 

2.1. Overview of the application 

In the Android mobile application development phase, several useful assistants are combined in a
single application:

a. Object detection: it works on the image captured by the mobile phone's camera. The model is trained
with the database objects to identify the image. It helps people with visual impairment find their
items.

b. Colour detection: it works on the captured image, and the colour name is based on the RGB values of
the detected image. This feature may help in daily routines such as choosing the colour of clothes and
shoes.

c. Currency note detection: the image taken with the camera is compared with the trained dataset for
recognition. This helps protect visually impaired people from being cheated by others.
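
One simple way to derive a colour name from RGB values is a nearest-colour lookup, sketched below in Python. This is an illustrative assumption rather than the paper's implementation: the palette, the averaging step, and the function names `mean_rgb` and `nearest_colour_name` are all hypothetical.

```python
# Hypothetical reference palette; the paper does not list its colour table.
PALETTE = {
    "red": (255, 0, 0),
    "green": (0, 128, 0),
    "blue": (0, 0, 255),
    "black": (0, 0, 0),
    "white": (255, 255, 255),
    "yellow": (255, 255, 0),
}

def mean_rgb(pixels):
    """Average a list of (r, g, b) pixels into one representative colour."""
    count = len(pixels)
    if count == 0:
        raise ValueError("at least one pixel is required")
    return tuple(sum(p[c] for p in pixels) / count for c in range(3))

def nearest_colour_name(rgb):
    """Return the palette name with the smallest squared RGB distance."""
    def distance(name):
        ref = PALETTE[name]
        return sum((a - b) ** 2 for a, b in zip(rgb, ref))
    return min(PALETTE, key=distance)

# Example: a small patch of slightly varying blue pixels.
patch = [(10, 20, 200), (5, 30, 220), (0, 10, 240)]
print(nearest_colour_name(mean_rgb(patch)))
```

Averaging a patch before the lookup makes the result less sensitive to single noisy pixels. A perceptual colour space could be swapped in for plain RGB distance.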

The features of the application are associated with an audio output in the form of a verbal message for
user notification, as it is important for a visually impaired person to be able to identify the object
[17]-[19]. Overall, the application features include:

a. An Android based mobile application that can be used anytime and anywhere without an internet
connection. This protects the user's personal data from third parties [20], [21].

b. User input by swipe gesture and speech, which is specially designed for easy use by the visually
impaired [22].

c. Deep learning with a CNN for fast and accurate image processing for detection and prediction.

d. Three different modes: object detection, colour detection, and currency detection. The user can
change the mode at any time by swiping or by voice input.

e. A verbal message output that notifies the user of the identified objects.

Figure 1 shows the flowchart of how the application operates once it is ready for use. It starts with
the main menu, which consists of three different modes that the user can choose by clicking the menu or
by speech input. The camera turns on automatically, and the user can capture an image for detection and
prediction using the CNN model. The application notifies the user through the audio output.
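
The flow described above (choose a mode, capture an image, predict, announce the result) can be sketched as follows. This is a simplified Python model of the control flow only, not the Android code: the `Mode` names, the voice keywords, and the injected `predict` and `speak` callables are hypothetical stand-ins for the CNN model and the audio output.

```python
from enum import Enum

class Mode(Enum):
    OBJECT = "object detection"
    COLOUR = "colour detection"
    CURRENCY = "currency detection"

# Hypothetical spoken keywords mapped to modes.
VOICE_COMMANDS = {
    "object": Mode.OBJECT,
    "colour": Mode.COLOUR,
    "currency": Mode.CURRENCY,
}

def select_mode(spoken_text, current):
    """Pick a mode from speech input; keep the current mode if nothing matches."""
    for keyword, mode in VOICE_COMMANDS.items():
        if keyword in spoken_text.lower():
            return mode
    return current

def next_mode(current):
    """Cycle to the next mode, as a swipe gesture might."""
    modes = list(Mode)
    return modes[(modes.index(current) + 1) % len(modes)]

def run_once(mode, image, predict, speak):
    """Capture -> predict -> announce, using injected callables."""
    label = predict(mode, image)
    message = f"{mode.value}: {label}"
    speak(message)
    return message

# Usage with stand-ins for the model and the text-to-speech output.
spoken = []
mode = select_mode("switch to currency", Mode.OBJECT)
message = run_once(mode, image=b"", predict=lambda m, img: "RM10", speak=spoken.append)
print(message)
```

Passing `predict` and `speak` in as parameters keeps the flow testable without a camera, a model, or an audio device.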


Figure 1. Flowchart of the android application system


2.2. Application architecture 

The application is developed mainly in Android Studio and configured with the TensorFlow Object
Detection API for object detection and recognition using a CNN model. The CNN is trained using a single
shot detector (SSD) with a pre-trained MobileNet V2 model, which was developed on a Google dataset.
Basically, the image captured by the user's mobile phone is compared with the pre-trained images in the
database. Features of the image loaded from the camera are extracted and classified through the
prediction process using the TensorFlow algorithm. The application notifies the user of the predicted
image through the audio output. Figure 2 shows the general application architecture.
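
As a rough illustration of the last step, the Python sketch below turns a detector's class and score outputs into the text of a verbal message. It assumes the detector returns parallel lists of class ids and confidence scores, as SSD-style detectors typically do. The label map, the 0.5 score threshold, and the function names are hypothetical and not taken from the paper.

```python
def best_detection(classes, scores, label_map, min_score=0.5):
    """Return (label, score) for the highest-scoring detection at or above min_score, else None."""
    best = None
    for class_id, score in zip(classes, scores):
        if score < min_score:
            continue  # Ignore low-confidence detections.
        if best is None or score > best[1]:
            best = (label_map.get(class_id, "unknown object"), score)
    return best

def to_message(detection):
    """Build the text that would be passed to the audio output."""
    if detection is None:
        return "No object recognised. Please try again."
    label, score = detection
    return f"{label}, {score * 100:.2f} percent"

# Hypothetical detector output for one captured image.
label_map = {1: "RM1", 2: "RM10", 3: "RM20"}
classes = [1, 2, 3]
scores = [0.31, 0.9973, 0.42]

message = to_message(best_detection(classes, scores, label_map))
print(message)
```

Returning a fallback message when nothing passes the threshold avoids announcing a low-confidence guess to the user.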

TensorFlow is an end-to-end open source deep learning platform developed by Google whose models can be
deployed on mobile or embedded devices. It makes model building much easier with intuitive high-level
APIs, which allow immediate and easy debugging for any application. TensorFlow also provides a platform
for executing machine learning algorithms on a wide range of heterogeneous systems, from mobile phones
to large-scale systems. Models can be trained easily, and the optimization of GPU usage accelerates the
application of CNNs [23].

The block diagram for image classification using TensorFlow is shown in Figure 3. The data sets are
stored in IDX format, with image and label information for both the testing and training data.
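
For readers unfamiliar with IDX, the Python sketch below writes and parses a tiny unsigned-byte IDX blob in memory. The layout follows the publicly documented IDX format (a four-byte magic number, big-endian dimension sizes, then the raw data). The helper names and the sample labels are hypothetical, and only the unsigned-byte data type is handled.

```python
import struct

def write_idx_uint8(shape, values):
    """Serialise unsigned-byte data into IDX bytes."""
    # Magic number: two zero bytes, 0x08 for unsigned byte, then the number of dimensions.
    header = struct.pack(">BBBB", 0, 0, 0x08, len(shape))
    # Each dimension size is a 4-byte big-endian unsigned integer.
    dims = struct.pack(">" + "I" * len(shape), *shape)
    return header + dims + bytes(values)

def read_idx_uint8(blob):
    """Parse IDX bytes holding unsigned-byte data; return (shape, flat list of values)."""
    zero1, zero2, dtype, ndim = struct.unpack(">BBBB", blob[:4])
    if (zero1, zero2) != (0, 0) or dtype != 0x08:
        raise ValueError("not an unsigned-byte IDX blob")
    shape = struct.unpack(">" + "I" * ndim, blob[4:4 + 4 * ndim])
    data = list(blob[4 + 4 * ndim:])
    expected = 1
    for size in shape:
        expected *= size
    if len(data) != expected:
        raise ValueError("payload size does not match header")
    return shape, data

# Round trip with three hypothetical class labels.
sample_labels = [3, 0, 7]
blob = write_idx_uint8((3,), sample_labels)
shape, data = read_idx_uint8(blob)
print(shape, data)
```

An image file uses the same layout with three dimensions (count, rows, columns), so the same reader applies to both the image and label files.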

In this paper, the data set contains about 1,000 training images for each testing data item. In total,
there are almost 50,000 training images for 50 testing data items, covering different image conditions.
This is to ensure that the image classification produces high accuracy. In the second stage, designing
the CNN model algorithm using TensorFlow, the parameters were processed with 10,000 iterations and
different numbers of epochs. The training accuracy results for 10 and 100 epochs are shown in section 3.
In the last stage, image classification performance, different images were tested and the accuracy
percentage based on CNN image classification was calculated.


Figure 2. Application architecture


[Block diagram: Data set preparing -> Creation of parameter and API of CNN model -> Image classification results]

Figure 3. Proposed block diagram of utilizing CNN by TensorFlow


3. RESULTS AND DISCUSSION 
3.1. Object detection and recognition classifier training 

Before developing the object detection and recognition, analysing the classifier training accuracy is
a necessary step to make sure that the application provides high accuracy. About 250 images per class
were collected with a variety of backgrounds, orientations, and conditions. The images were trained with
different numbers of epochs to observe the training and validation accuracy. Figure 4 and Figure 5 show
the training and validation accuracy results using 10 and 100 epochs respectively.

As can be seen from Figure 4 and Figure 5, the highest training accuracy is 1.0, while the highest
validation accuracy is 0.717 at 100 epochs. The results show that training for more epochs achieved a
higher validation accuracy.
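
The quantities discussed above can be computed from per-epoch curves as in the Python sketch below. The curve values are hypothetical apart from the reported end points (training accuracy 1.0 and validation accuracy 0.717), and the function names are illustrative.

```python
def accuracy(predictions, targets):
    """Fraction of predictions that match the targets."""
    if len(predictions) != len(targets) or not targets:
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(1 for p, t in zip(predictions, targets) if p == t)
    return correct / len(targets)

def summarise(train_acc, val_acc):
    """Report the best validation accuracy, its epoch (1-based), and the final train/validation gap."""
    best_epoch = max(range(len(val_acc)), key=lambda i: val_acc[i])
    return {
        "best_val": val_acc[best_epoch],
        "best_epoch": best_epoch + 1,
        "final_gap": train_acc[-1] - val_acc[-1],
    }

# Hypothetical per-epoch curves; only the end points 1.0 and 0.717 come from the text.
train_curve = [0.62, 0.85, 0.97, 1.0]
val_curve = [0.55, 0.64, 0.70, 0.717]

summary = summarise(train_curve, val_curve)
print(summary)
```

Tracking the gap between the two curves alongside the best validation value shows how far training accuracy runs ahead of accuracy on unseen images.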


Figure 4. The training and validation accuracy based on 10 epochs

Figure 5. The training and validation accuracy based on 100 epochs


3.2. Application testing 

The application is designed with a simple interface to make it user friendly for the visually impaired.
It contains three menu buttons: currency note detector, colour detector, and explore things (the object
detector), as shown in Figure 6. A menu item can be selected by touching the screen or by using the
talkback function.


Figure 6. User interface design of android application


The application was tested in real time with currency note and colour detection. The results, with the
accuracy percentages, are presented in Table 1 and Table 2. Based on both tables, the detection accuracy
is relatively good: the reported values range from 70.74% to 99.95%, and every result is marked True.

Table 1. Results of currency note detection

Image captured    Accuracy (%)    Result
[image]           99.73           Ten Ringgit Malaysia (RM10), True
[image]           97.92           One Ringgit Malaysia (RM1), True
[image]           98.88           Twenty Ringgit Malaysia (RM20), True
Table 2. Results of colour detection

Image captured    Accuracy (%)    Result
[image]           99.95           Red Colour, True
[image]           95.07           Blue Colour, True
[image]           70.74           Black Colour, True


3.3. Site testing implementation 




The site testing was carried out with real subjects, people with visual impairment, at the reflexology
centre in Melaka Mall. Before they used the application, a demonstration with an explanation was given.
As the subjects are totally blind, the talkback function was used as the input for menu selection.
Figure 7 shows a photo of a subject using the application independently.


Feedback was collected from the subjects after the testing. They were very excited about the
application because it was able to detect things in front of them accurately and effectively. They said
that the features of this application are really needed, and they hope it will make their surroundings
more accessible to them.

Figure 7. The photo taken during the site testing implementation


3.4. Comparison with other applications
B. S. Lin et al. proposed computer image recognition as a guidance system that recognizes multiple
obstacles in every image using a CNN algorithm. Its recognition accuracy reached only 60% [24]. Another
application, presented by Anitha, focuses on real-time object detection and achieved up to 92.16%, which
is quite efficient for image classification [25]. Compared with these applications, the application
proposed in this paper has a higher image classification accuracy, up to 99.95%. The novelties of this
application compared with existing ones are:
a. It is an assistive mobile application that helps make the world more accessible to visually impaired
people, using a deep learning approach to classify images from the phone's camera.
b. It gives fast and more accurate results because it uses on-device image recognition based on deep
learning.
c. Because it uses on-device image recognition, no internet connection is required, so the user can use
the application anywhere and anytime.
d. Additionally, with no internet connection required, personal data is safe and secure. No third party
is involved in retrieving the data.
e. The application is user friendly. It has a simple interface with no distraction from complicated
settings, so it can easily be used by the visually impaired.


4. CONCLUSION 

This paper has presented the development of an Android application for people with visual impairment,
with some novelty compared with existing applications. Based on the results, the application is able to
predict the image captured by the user with high accuracy, up to 99.95%. In addition, site testing with
visually impaired people was carried out and received positive feedback. As future work, additional
features can be added, as well as the inclusion of the internet of things (IoT) for more advanced
applications.